Extended Parallel Corpus for Amharic-English Machine Translation

Authors: Bati Tesfaye Bayu; Gezmu Andargachew Mekonnen; Nürnberger Andreas
Publication date: 25/06/2021

This paper describes the acquisition, preprocessing, segmentation, and alignment of an Amharic-English parallel corpus. It will be useful for machine translation of an under-resourced language, Amharic. The corpus is larger than previously compiled corpora; it is released for research purposes. We trained neural machine translation and phrase-based statistical machine translation models using the corpus. In the automatic evaluation, neural machine translation models outperform phrase-based statistical machine translation models.

Comment: Accepted to 2nd AfricanNLP workshop at EACL 202

arXiv.org e-Print Archive